An Assessment of the Accuracy of Automatic Evaluation in Summarization

Authors

  • Karolina Owczarzak
  • John M. Conroy
  • Hoa Trang Dang
  • Ani Nenkova
Abstract

Automatic evaluation has greatly facilitated system development in summarization. At the same time, the use of automatic evaluation has been viewed with mistrust by many, as its accuracy and correct application are not well understood. In this paper we provide an assessment of the automatic evaluations used for multi-document summarization of news. We outline our recommendations about how any evaluation, manual or automatic, should be used to find statistically significant differences between summarization systems. We identify the reference automatic evaluation metrics (ROUGE 1 and 2) that appear to best emulate human pyramid and responsiveness scores on four years of NIST evaluations. We then demonstrate the accuracy of these metrics in reproducing human judgements about the relative content quality of pairs of systems. We also present an empirical assessment of the relationship between statistically significant differences between systems according to manual evaluations and the differences according to automatic evaluations. Finally, we present a case study of how new metrics should be compared to the reference evaluation, as we search for even more accurate automatic measures.
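For readers unfamiliar with the ROUGE 1 and 2 metrics named in the abstract, the following is a minimal sketch of n-gram recall against a single reference summary. It is an illustration only, not the official ROUGE toolkit or the authors' setup: the function names `ngrams` and `rouge_n_recall` are hypothetical, tokenization is plain lowercased whitespace splitting, and stemming, stopword handling, and multiple references are all omitted.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also occur in the candidate.

    Simplifying assumptions: whitespace tokenization, lowercasing,
    one reference summary, clipped counts for repeated n-grams.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    system_summary = "the cat sat on the mat"
    human_summary = "the cat lay on the mat"
    print(rouge_n_recall(system_summary, human_summary, n=1))
    print(rouge_n_recall(system_summary, human_summary, n=2))
```

With n=1 this corresponds roughly to ROUGE 1 recall and with n=2 to ROUGE 2 recall; scores from a sketch like this will generally differ from those of the standard toolkit because of the omitted preprocessing.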

Similar resources

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text, while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data, in order to extract the useful information, is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

Systematic literature review of fuzzy logic based text summarization

"Information Overload" is not a new term, but the massive development in technology, which enables anytime, anywhere, easy and unlimited access, participation and publishing of information, has consequently escalated its impact. Assisting users' informational searches with reduced reading and surfing time by extracting and evaluating accurate, authentic and relevant information are the primary c...

Computational Linguistics: Human Language Technologies Proceedings of the Workshop on Evaluation Metrics and System Comparison for Automatic Summarization

Automatic evaluation has greatly facilitated system development in summarization. At the same time, the use of automatic evaluation has been viewed with mistrust by many, as its accuracy and correct application are not well understood. In this paper we provide an assessment of the automatic evaluations used for multi-document summarization of news. We outline our recommendations about how any e...

Improving Automatic Summarization of Persian Texts Using Natural Language Processing Methods and a Similarity Graph

A significant amount of available information is stored in textual databases, which contain large collections of documents from different sources (such as news, articles, books, emails and web pages). The increasing visibility and importance of this class of information motivates us to work on having better automatic evaluation tools for textual resources. The automatic summarization of tex...

Biogeography-Based Optimization Algorithm for Automatic Extractive Text Summarization

Given the increasing number of documents, sites, online sources, and the users’ desire to quickly access information, automatic textual summarization has caught the attention of many researchers in this field. Researchers have presented different methods for text summarization as well as a useful summary of those texts including relevant document sentences. This study select...

Publication date: 2012